Character Encodings and Their Internet Names
Table C-1 lists character encodings for various languages, gives some of their common Internet names, and identifies the version of the Text Encoding Conversion Manager in which each character encoding was first supported for use by the Text Encoding Converter and the Unicode Converter. In the last two columns of the table, "N/A" means that the encoding is not supported.

Table C-1 Character encoding Internet names and availability in Mac OS
Character encoding | Common Internet names | Related information | Text Encoding Converter | Unicode Converter
(The last two columns give the version of the Text Encoding Conversion Manager that first offered support.)

Universal
Unicode 2.0 (16-bit) | UTF-16 | | 1.2 | 1.2
Unicode 2.0 UTF-8 | UTF-8 | | 1.2 | 1.2.1
Unicode 2.0 UTF-7 | UTF-7 | | 1.2 | N/A
Unicode 1.1 (16-bit) | UNICODE-1-1 | | 1.2 | 1.2
Unicode 1.1 UTF-8 | UNICODE-1-1-UTF-8 | | 1.2 | 1.2.1
Unicode 1.1 UTF-7 | UNICODE-1-1-UTF-7 | | 1.2 | N/A

Western European languages
ASCII | US-ASCII | | 1.2.1 | 1.2.1
ISO 8859-1 (Latin-1) | ISO-8859-1, latin1 | | 1.2.1 | 1.2.1
CP 1252 (Windows Latin-1) | windows-1252, cp1252 | ISO 8859-1, plus additions in C1 area | 1.2 | 1.2
CP 437 (DOS Latin-US) | cp437 | | 1.2 | 1.2
CP 850 (DOS Latin-1) | cp850 | | 1.4 | 1.4
Mac OS Roman | mac, macintosh, x-mac-roman | | 1.2 | 1.2
Mac OS Icelandic | x-mac-icelandic | Based on Mac OS Roman | 1.2 | 1.2
Mac OS Latin-1, Mac OS Mail | x-mac-latin1 (commonly sent as ISO-8859-1) | Mac OS Roman permuted to align with 8859-1 | 1.2 | 1.2
NextStep Latin | | | 1.2 | 1.2
CP 037 (EBCDIC-US) | cp037 | ISO 8859-1 repertoire, different layout | 1.2.1 | 1.2.1

Arabic
ISO 8859-6 (Latin/Arabic) | ISO-8859-6, arabic | | 1.2 | 1.2
CP 1256 (Windows Arabic) | windows-1256, cp1256 | Partly 8859-6, plus C1 additions | 1.2 | 1.2
CP 864 (DOS Arabic) | cp864 | Encodes Arabic presentation forms | 1.2 | 1.2
Mac OS Arabic | x-mac-arabic | | 1.2 | 1.2
Mac OS Farsi | x-mac-farsi | | 1.2 | 1.2

Central European languages
ISO 8859-2 (Latin-2) | ISO-8859-2, latin2 | | 1.2 | 1.2
CP 1250 (Windows Latin-2) | windows-1250, cp1250 | Partly 8859-2, plus C1 additions | 1.2 | 1.2
Mac OS Central European Roman | x-mac-centraleurroman | | 1.2 | 1.2
Mac OS Croatian | x-mac-croatian | Based on Mac OS Roman | 1.2 | 1.2
Mac OS Romanian | x-mac-romanian | Based on Mac OS Roman | 1.2 | 1.2

Chinese
GB 2312-80 | | | 1.2 | N/A
EUC-CN | GB2312, X-EUC-CN | ASCII + GB 2312-80 (8-bit) | 1.2 | 1.2
CP 936 (DOS and Windows Simplified) | | Similar to GBK | 1.4 | 1.4
Mac OS Chinese Simplified | | Based on EUC-CN | 1.2 | 1.2
ISO 2022-CN ("GB") | ISO-2022-CN | ASCII + GB 2312-80 (7-bit); see RFC 1922 | 1.2 | N/A
HZ | HZ-GB-2312 | ASCII + GB 2312-80 (7-bit); see RFC 1842 | 1.2 | N/A
GBK (extended GB) | | EUC-CN + Unihan repertoire (8-bit) | 1.2 | 1.2
CNS 11643 plane 1 | x-cns11643-1 | | N/A | N/A
CNS 11643 plane 2 | x-cns11643-2 | | N/A | N/A
EUC-TW | X-EUC-TW | ASCII + CNS 11643-1992 (8-bit) | 1.2 | 1.2
Big-5 | Big5 | (8-bit) | 1.2 | 1.2
CP 950 (DOS and Windows Traditional) | | Based on Big-5 | 1.4 | 1.4
Mac OS Chinese Traditional | | Based on Big-5 | 1.2 | 1.2
CCCII | | | N/A | N/A
EACC | | | N/A | N/A

Cyrillic
ISO 8859-5 (Latin/Cyrillic) | ISO-8859-5, cyrillic | | 1.2 | 1.2
KOI8-R | KOI8-R | See RFC 1489 | 1.2 | 1.2
CP 1251 (Windows Cyrillic) | windows-1251, cp1251 | Not based on ISO 8859-5 | 1.2 | 1.2
CP 866 (DOS Russian) | cp866 | | N/A | N/A
Mac OS Cyrillic | x-mac-cyrillic | | 1.2 | 1.2
Mac OS Ukrainian | x-mac-ukrainian | Mac OS Cyrillic with two replacements | 1.2 | 1.2

Greek
ISO 8859-7 | ISO-8859-7, greek | | 1.2 | 1.2
ISO 5428 | ISO_5428:1980 | | N/A | N/A
CP 1253 (Windows Greek) | windows-1253, cp1253 | Nearly 8859-7, plus C1 additions | 1.2 | 1.2
Mac OS Greek | x-mac-greek | | 1.2 | 1.2
Greek CCITT | greek-ccitt | | N/A | N/A

Hebrew
ISO 8859-8 (Latin/Hebrew) | ISO-8859-8, hebrew | | 1.2 | 1.2
CP 1255 (Windows Hebrew) | windows-1255, cp1255 | Mostly 8859-8, plus C1 additions | 1.2 | 1.2
Mac OS Hebrew (2 variants) | x-mac-hebrew | | 1.2 | 1.2

Indic
ISCII-91 | | Parallel encodings for all Indic scripts | N/A | N/A
Mac OS Gujarati | | | 1.2 | 1.2
Mac OS Devanagari | | | 1.2 | 1.2
Mac OS Gurmukhi | | | 1.2 | 1.2

Japanese
JIS X0208 | | | 1.2 | N/A
JIS X0212 | | | N/A | N/A
EUC-JP | EUC-JP, X-EUC-JP | JIS 201 + JIS 208 + JIS 212 (8-bit) | 1.2 | 1.4
ISO 2022-JP ("JIS") | ISO-2022-JP | JIS 201 + JIS 208 + JIS 212 (7-bit); RFC 1468 | 1.2 | N/A
Shift-JIS | Shift_JIS, x-sjis, x-shift-jis | JIS 201 + JIS 208 (8-bit) | 1.2 | 1.2
CP 932 (DOS + Windows) | | Based on Shift-JIS | 1.4 | 1.4
Mac OS Japanese | | Based on Shift-JIS | 1.2 | 1.2

Korean
KSC 5601-1987 | | | 1.2 | N/A
EUC-KR | EUC-KR | ASCII + KSC 5601-87 (8-bit); RFC 1557 | 1.2 | 1.2
CP 949 (DOS + Windows) | | Unified Hangul Code: EUC-KR + Johab | N/A | N/A
Mac OS Korean | | Based on EUC-KR | 1.2 | 1.2
ISO 2022-KR ("KSC") | ISO-2022-KR | ASCII + KSC 5601-87 (7-bit); RFC 1557 | 1.2 | N/A
KSC 5700 | | | N/A | N/A

Symbols encoding
Adobe Symbol | Adobe-Symbol-Encoding | | N/A | N/A
Mac OS Symbol | x-mac-symbol | Based on Adobe Symbol | 1.2 | 1.2
Mac OS dingbats | x-mac-dingbats | Based on Adobe Zapf Dingbats | 1.2 | 1.2

Thai
TIS 620-2533 | | | N/A | N/A
CP 874 (DOS + Windows) | cp874 | Based on TIS 620-2533 | 1.4 | 1.4
Mac OS Thai | x-mac-thai | Based on TIS 620-2533 | 1.2 | 1.2

Turkish
ISO 8859-9 (Latin-5) | ISO-8859-9, latin5 | | 1.2 | 1.2
ISO 8859-3 (Latin-3) | ISO-8859-3 | | N/A | N/A
CP 1254 (Windows Latin-5) | windows-1254, cp1254 | | 1.2 | 1.2
Mac OS Turkish | x-mac-turkish | Based on Mac OS Roman | 1.2 | 1.2

Vietnamese
VISCII | VISCII | RFC 1456 | N/A | N/A
TCVN-n | | | N/A | N/A
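
Because several encodings in Table C-1 have more than one Internet name, a program that receives a name (for example, from a mail header) typically needs to map every alias to one encoding. The following is a minimal sketch of such a lookup in Python, not the Text Encoding Conversion Manager's own mechanism. The names INTERNET_NAMES, build_index, and encoding_for are hypothetical and chosen for illustration; the table holds only a few rows copied from Table C-1, and the case-insensitive comparison is an assumption based on common practice for Internet charset names.

```python
# Hypothetical alias table: a handful of rows copied from Table C-1.
# Keys are the "Character encoding" labels; values are the Internet names
# listed for that encoding.
INTERNET_NAMES = {
    "ISO 8859-1 (Latin-1)": ["ISO-8859-1", "latin1"],
    "CP 1252 (Windows Latin-1)": ["windows-1252", "cp1252"],
    "Mac OS Roman": ["mac", "macintosh", "x-mac-roman"],
    "EUC-JP": ["EUC-JP", "X-EUC-JP"],
    "Shift-JIS": ["Shift_JIS", "x-sjis", "x-shift-jis"],
}


def build_index(table):
    """Invert the table so each lowercased Internet name maps to its encoding."""
    index = {}
    for encoding, names in table.items():
        for name in names:
            index[name.lower()] = encoding
    return index


NAME_INDEX = build_index(INTERNET_NAMES)


def encoding_for(name):
    """Return the encoding label for an Internet name, or None if unknown.

    Assumption: names are compared case-insensitively and surrounding
    whitespace is ignored.
    """
    return NAME_INDEX.get(name.strip().lower())
```

With this sketch, encoding_for("LATIN1") and encoding_for("iso-8859-1") both resolve to the same table row, and a name that is not in the table yields None so the caller can decide how to fall back.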